Pruning Fuzzy ARTMAP using the Minimum Description Length Principle in Learning from Clinical Databases
Abstract
Fuzzy ARTMAP is one of the families of neural network architectures based on ART (Adaptive Resonance Theory) in which supervised learning can be carried out. However, it usually tends to create more categories than are actually needed. This often causes the so-called overfitting problem: in fuzzy ARTMAP, the performance of the network on the test set does not increase monotonically with additional training epochs and category creation. To avoid the overfitting problem, Carpenter and Tan [Carpenter and Tan, 1993] proposed a confidence-based pruning method that eliminates those categories that are either less useful or less accurate. This paper proposes an alternative pruning method based on the Minimum Description Length (MDL) principle. The MDL principle can be viewed as a tradeoff between theory complexity and data prediction accuracy given the theory. We adopted Cameron-Jones' error encoding scheme and Quinlan's modifier for theory encoding to estimate the fuzzy ARTMAP theory description length. We propose a greedy search algorithm that prunes the fuzzy ARTMAP categories one by one so as to minimize the description length. The experiments showed that fuzzy ARTMAP pruned with the MDL principle gave better performance with far fewer categories than the original fuzzy ARTMAP and other machine learning systems on a number of benchmark clinical databases, such as the heart disease, breast cancer and diabetes databases. (Subject Area: Neural Networks; Knowledge Acquisition and Machine Learning)
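The greedy pruning idea can be illustrated with a minimal Python sketch. It is not the paper's implementation. It rests on several assumptions: a nearest-prototype classifier stands in for fuzzy ARTMAP category choice, a flat cost of `bits_per_category` per category stands in for the theory encoding, and a simple "which cases are exceptions" code stands in for the error encoding. All function and parameter names are hypothetical.

```python
import math

def predict(categories, x):
    """Return the label of the nearest prototype.

    Assumption: a plain nearest-prototype rule stands in for
    fuzzy ARTMAP's category choice.
    """
    nearest = min(categories,
                  key=lambda c: sum((a - b) ** 2 for a, b in zip(c[0], x)))
    return nearest[1]

def description_length(categories, data, bits_per_category=2.0):
    """Toy two-part code length in bits (theory + exceptions).

    Assumption: this is a simplified stand-in, not the encoding
    schemes named in the abstract.
    """
    n = len(data)
    errors = sum(predict(categories, x) != y for x, y in data)
    theory_bits = bits_per_category * len(categories)
    # Bits to state the error count, then which of the n cases are wrong.
    error_bits = math.log2(n + 1) + math.log2(math.comb(n, errors))
    return theory_bits + error_bits

def greedy_mdl_prune(categories, data, bits_per_category=2.0):
    """Drop one category at a time while the description length shrinks."""
    current = list(categories)
    best = description_length(current, data, bits_per_category)
    while len(current) > 1:
        candidates = [current[:i] + current[i + 1:]
                      for i in range(len(current))]
        lengths = [description_length(c, data, bits_per_category)
                   for c in candidates]
        i = min(range(len(lengths)), key=lengths.__getitem__)
        if lengths[i] >= best:
            break  # no single removal shortens the description
        current, best = candidates[i], lengths[i]
    return current

# Usage: two well-separated classes, four categories (two are redundant).
data = [((0.1, 0.1), 0), ((0.2, 0.1), 0), ((0.15, 0.2), 0),
        ((0.9, 0.9), 1), ((0.8, 0.85), 1), ((0.85, 0.8), 1)]
categories = [((0.1, 0.1), 0), ((0.2, 0.15), 0),
              ((0.9, 0.9), 1), ((0.8, 0.8), 1)]
pruned = greedy_mdl_prune(categories, data)
```

The stopping rule is what expresses the tradeoff: a category is removed only if the bits saved on the theory outweigh the extra bits needed to describe the new misclassifications.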
Related resources
Rule Extraction, Fuzzy ARTMAP, and Medical Databases
This paper shows how knowledge, in the form of fuzzy rules, can be derived from a self-organizing supervised learning neural network called fuzzy ARTMAP. Rule extraction proceeds in two stages: pruning removes those recognition nodes whose confidence index falls below a selected threshold; and quantization of continuous learned weights allows the final system state to be translated into a usab...
Pipelining of Fuzzy ARTMAP without matchtracking: Correctness, performance bound, and Beowulf evaluation
Fuzzy ARTMAP neural networks have been proven to be good classifiers on a variety of classification problems. However, the time that Fuzzy ARTMAP takes to converge to a solution increases rapidly as the number of patterns used for training is increased. In this paper we examine the time Fuzzy ARTMAP takes to converge to a solution and we propose a coarse grain parallelization technique, based o...
Unsupervised Transduction Grammar Induction via Minimum Description Length
We present a minimalist, unsupervised learning model that induces relatively clean phrasal inversion transduction grammars by employing the minimum description length principle to drive search over a space defined by two opposing extreme types of ITGs. In comparison to most current SMT approaches, the model learns very parsimonious phrase translation lexicons that provide an obvious basis for...
Cross-validation in Fuzzy ARTMAP for large databases
In this paper we examine the issue of overtraining in Fuzzy ARTMAP. Overtraining in Fuzzy ARTMAP manifests itself in two different ways: (a) it degrades the generalization performance of Fuzzy ARTMAP as training progresses; and (b) it creates unnecessarily large Fuzzy ARTMAP neural network architectures. In this work, we demonstrate that overtraining happens in Fuzzy ARTMAP and we ...
Probabilistic Network Construction Using the Minimum Description Length Principle
Probabilistic networks can be constructed from a database of cases by selecting a network that has highest quality with respect to this database according to a given measure. A new measure is presented for this purpose based on a minimum description length (MDL) approach. This measure is compared with a commonly used measure based on a Bayesian approach both from a theoretical and an experiment...